ABSTRACT


In the analysis of experimental data, many of the distributions 
encountered are the result of combining two or more separate component 
distributions. Estimation in these compound or mixed distributions is 
therefore of particular interest to aerospace scientists. Estimators 
are derived for the parameters of a compound Poisson distribution with
probability density function

$$ f(x) = \alpha\,\frac{e^{-\mu}\mu^{x}}{x!} + (1 - \alpha)\,\frac{e^{-\lambda}\lambda^{x}}{x!}, \qquad x = 0, 1, 2, \ldots, $$

and for a compound exponential distribution with probability density
function

$$ f(x) = \alpha(1/\mu)e^{-x/\mu} + (1 - \alpha)(1/\lambda)e^{-x/\lambda}, \qquad x \ge 0, $$

where $\alpha$ is the proportionality factor ($0 \le \alpha \le 1$) and where
$\mu$ and $\lambda$ are component parameters. In addition to the more general
case in which all parameters must be estimated from sample data, several
special cases are considered in which one or more of the parameters are
known in advance of sampling.
FOREWORD


This report presents results of an investigation performed by the 
Department of Statistics, University of Georgia, Athens, Georgia, as a 
part of NASA Contract NAS8-11175 with the Aerospace Environment Office, 
Aero-Astrodynamics Laboratory, NASA-George C. Marshall Space Flight 
Center, Huntsville, Alabama. Dr. A. C. Cohen, Jr. was the principal 
investigator. The NASA contract monitors are Mr. O. E. Smith and Mr.
J. D. Lifsey.

The results of this study represent a contribution in the area of 
statistical estimation from compound (mixed) frequency distributions. 
The methods presented are straightforward and may be easily adapted to 
practical application. 
ESTIMATION IN MIXTURES OF POISSON AND
MIXTURES OF EXPONENTIAL DISTRIBUTIONS 


SUMMARY 


In the analysis of experimental data, many of the distributions 
encountered are the result of combining two or more separate component 
distributions. Estimation in these compound or mixed distributions is 
therefore of particular interest to aerospace scientists. Estimators 
are derived for the parameters of a compound Poisson distribution with
probability density function

$$ f(x) = \alpha\,\frac{e^{-\mu}\mu^{x}}{x!} + (1 - \alpha)\,\frac{e^{-\lambda}\lambda^{x}}{x!}, \qquad x = 0, 1, 2, \ldots, $$

and for a compound exponential distribution with probability density
function

$$ f(x) = \alpha(1/\mu)e^{-x/\mu} + (1 - \alpha)(1/\lambda)e^{-x/\lambda}, \qquad x \ge 0, $$

where $\alpha$ is the proportionality factor ($0 \le \alpha \le 1$) and where
$\mu$ and $\lambda$ are component parameters. In addition to the more general
case in which all parameters must be estimated from sample data, several
special cases are considered in which one or more of the parameters are
known in advance of sampling.


I. INTRODUCTION 

Many of the distributions encountered in the analysis of experimental
data are the result of combining two or more separate component
distributions. Accordingly, estimation in these compound or mixed
distributions is of particular interest to aerospace scientists. A
previous paper [2] dealt with estimation in mixtures of two Poisson
distributions; these previous results are extended here to include
several special cases wherein one or more of the parameters of the
compound Poisson distribution are known, and in addition analogous
estimators are derived for the parameters of the compound exponential
distribution.

The author wishes to acknowledge the assistance of Mr. Frank Clark 
for his work in establishing the IBM 7094 computer program described in 
Section IV and the Appendix. 


II. MIXTURES OF TWO POISSON DISTRIBUTIONS 


1. The Probability Density Function

The probability density function of a compound distribution composed of
two Poisson components with parameters $\mu$ and $\lambda$, respectively,
combined in proportions $\alpha$ and $1 - \alpha$ may be written as

$$ f(x) = \alpha\,\frac{e^{-\mu}\mu^{x}}{x!} + (1 - \alpha)\,\frac{e^{-\lambda}\lambda^{x}}{x!}, \qquad x = 0, 1, 2, \ldots, \quad 0 \le \alpha \le 1. \tag{1} $$

For convenience and without any loss of generality, we assume $\mu > \lambda$.
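
As a quick numerical check of equation (1), the following minimal Python sketch evaluates the mixed Poisson probabilities and confirms that they sum to one. The function name `mixed_poisson_pmf` and the parameter values are illustrative assumptions and do not come from the report.

```python
import math


def mixed_poisson_pmf(x, alpha, mu, lam):
    """Evaluate equation (1) at a non-negative integer x (illustrative helper)."""
    def poisson(rate):
        return math.exp(-rate) * rate ** x / math.factorial(x)
    return alpha * poisson(mu) + (1.0 - alpha) * poisson(lam)


# Hypothetical parameter values, chosen only for illustration.
ALPHA, MU, LAM = 0.5, 1.5, 0.5
probs = [mixed_poisson_pmf(x, ALPHA, MU, LAM) for x in range(60)]
total = sum(probs)                               # close to 1
mean = sum(x * p for x, p in enumerate(probs))   # alpha*mu + (1 - alpha)*lam
print(total, mean)
```

The sum is truncated at x = 59, which is far enough into the tail for these rates that the omitted probability is negligible.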


2. Three-Moment Estimators

The following estimating equations result from equating the first
three factorial moments of a sample of size n to the corresponding
theoretical moments.

$$ \begin{aligned} \alpha &= \frac{\bar{x} - \lambda}{\mu - \lambda}, \\ \bar{x}\theta - \Gamma &= \nu_{[2]}, \\ \bar{x}(\theta^{2} - \Gamma) - \Gamma\theta &= \nu_{[3]}, \end{aligned} \tag{2} $$

where

$$ \theta = \mu + \lambda \quad \text{and} \quad \Gamma = \mu\lambda, \tag{3} $$

and where the $k$th sample factorial moment is given by

$$ \nu_{[k]} = \frac{1}{n}\sum_{x=0}^{R} x(x-1)\cdots(x-k+1)\,n_x, \tag{4} $$

in which $R$ is the largest observed (sample) value of the random variable
$x$, $n_x$ is the sample frequency of $x$, and

$$ n = \sum_{x=0}^{R} n_x. $$

For simplicity of notation, $\bar{x}$ has been written in place of $\nu_{[1]}$ for the
first sample factorial moment.
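
The factorial moments of equation (4) are simple to compute from a frequency table. The sketch below is a minimal Python illustration; the helper name `factorial_moment` is an assumption, and the frequencies are those of the mixed Poisson example tabulated in Section V.

```python
def factorial_moment(freq, k):
    """k-th sample factorial moment, equation (4); freq[x] holds n_x for x = 0..R."""
    n = sum(freq)
    total = 0
    for x, n_x in enumerate(freq):
        falling = 1
        for j in range(k):
            falling *= x - j          # builds x(x-1)...(x-k+1)
        total += falling * n_x
    return total / n


# Frequencies n_0, ..., n_7 from the Section V example.
freq = [830, 638, 327, 137, 49, 15, 3, 1]
xbar = factorial_moment(freq, 1)
nu2 = factorial_moment(freq, 2)
nu3 = factorial_moment(freq, 3)
print(xbar, nu2, nu3)
```

Integer arithmetic is used for the sums so that the only rounding occurs in the final division by n.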

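On solving the last two equations of (2) simultaneously for $\Gamma$ and $\theta$,
it follows that

$$ \theta^{*} = \frac{\nu_{[3]} - \bar{x}\,\nu_{[2]}}{\nu_{[2]} - \bar{x}^{2}}, \qquad \Gamma^{*} = \bar{x}\theta^{*} - \nu_{[2]}, \tag{5} $$

where the asterisk (*) distinguishes estimators from the parameters being
estimated. The required estimators of $\mu$ and $\lambda$ follow as

$$ \mu^{*} = \tfrac{1}{2}\left[\theta^{*} + \sqrt{\theta^{*2} - 4\Gamma^{*}}\,\right], \qquad \lambda^{*} = \tfrac{1}{2}\left[\theta^{*} - \sqrt{\theta^{*2} - 4\Gamma^{*}}\,\right]. \tag{6} $$

These estimators are the two roots $r_1$ and $r_2$ of the quadratic equation

$$ Y^{2} - \theta^{*}Y + \Gamma^{*} = 0, \tag{7} $$

where $\mu^{*} = r_1$ and $\lambda^{*} = r_2$ $(r_1 > r_2)$. The proportionality parameter $\alpha$
is estimated from the first equation of (2) as $\alpha^{*} = (\bar{x} - \lambda^{*})/(\mu^{*} - \lambda^{*})$.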
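
The three-moment estimators reduce to a few lines of arithmetic. The following Python sketch is one possible transcription of equations (5) through (7); the function name `three_moment_poisson` and the error handling are assumptions, and the input values are factorial moments recomputed from the Section V frequency table.

```python
import math


def three_moment_poisson(xbar, nu2, nu3):
    """Three-moment estimates (mu*, lambda*, alpha*) from equations (5)-(7)."""
    theta = (nu3 - xbar * nu2) / (nu2 - xbar ** 2)    # equation (5)
    gamma = xbar * theta - nu2
    disc = theta ** 2 - 4.0 * gamma
    if disc <= 0.0:
        raise ValueError("quadratic (7) has no two distinct real roots")
    root = math.sqrt(disc)
    mu = 0.5 * (theta + root)                          # equation (6), larger root
    lam = 0.5 * (theta - root)                         # equation (6), smaller root
    alpha = (xbar - lam) / (mu - lam)                  # first equation of (2)
    return mu, lam, alpha


# xbar, nu_[2], nu_[3] recomputed from the Section V frequency table.
mu, lam, alpha = three_moment_poisson(0.9995, 1.248, 1.734)
print(mu, lam, alpha)
```

A non-positive discriminant signals that the sample moments are not compatible with a two-component mixture, so the sketch raises an error rather than returning complex roots.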



The estimators given in equation (6) were originally derived by
Rider [3], but he employed ordinary rather than factorial moments, with
the result that his derivations were somewhat complicated and his
expressions for $\theta^{*}$ and $\Gamma^{*}$ were more involved than those given here.


3. Estimators Based on the First Two Sample Moments and the Sample
Zero-Frequency

It is well known that the higher sample moments are subject to
appreciable sampling error. In an effort to reduce errors from this
source, an estimating equation based on the first two sample moments and
the sample zero-frequency was derived [1] as

$$ \frac{\bar{x} - \lambda}{G(\lambda) - \lambda} = \frac{n_0/n - e^{-\lambda}}{e^{-G(\lambda)} - e^{-\lambda}}, \tag{8} $$

in which

$$ G(\lambda) = \frac{\nu_{[2]} - \bar{x}\lambda}{\bar{x} - \lambda}, \tag{9} $$

where $n_0$ is the sample zero-frequency. Equation (8) can be solved for
$\lambda^{**}$ using standard iterative procedures and, with $\lambda^{**}$ thus determined,
estimators of $\mu$ and $\alpha$ follow as

$$ \mu^{**} = \frac{\nu_{[2]} - \bar{x}\lambda^{**}}{\bar{x} - \lambda^{**}}, \qquad \alpha^{**} = \frac{\bar{x} - \lambda^{**}}{\mu^{**} - \lambda^{**}}. \tag{10} $$

The double asterisk (**) distinguishes these estimators from the
three-moment estimators and in turn from the parameters being estimated.
Unfortunately, no simple procedure for solving equation (8) has been
devised. However, a computer program based on iterative procedures
described by Whittaker and Robinson [4, Chap. VI] has been developed
(see Appendix) to solve equation (8), using as a first approximation
the three-moment estimate of $\lambda$ given by equation (6).

4. Estimation With Some Parameters Specified


a. $\alpha$ Known


In this case, we need only estimate $\mu$ and $\lambda$; for this purpose,
the first two equations of (2) may be written as

$$ \alpha = \frac{\bar{x} - \lambda}{\mu - \lambda}, \qquad \bar{x}(\mu + \lambda) - \mu\lambda = \nu_{[2]}, \tag{11} $$

where $\theta$ and $\Gamma$ have been replaced by their defining relations as given
in equation (3).

With $\alpha$ known, we obtain the following quadratic equation in $\lambda$
from the two equations of (11):

$$ \lambda^{2} - 2\bar{x}\lambda + \frac{\bar{x}^{2} - \alpha\nu_{[2]}}{1 - \alpha} = 0. \tag{12} $$

On solving equation (12), and taking the smaller root since
$\lambda < \bar{x} < \mu$,

$$ \lambda^{*} = \bar{x} - \sqrt{\frac{\alpha(\nu_{[2]} - \bar{x}^{2})}{1 - \alpha}}, \tag{13} $$

and from the first equation of (11)

$$ \mu^{*} = [\bar{x} - \lambda^{*}(1 - \alpha)]/\alpha. \tag{14} $$
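
When the proportion is known in advance, equations (13) and (14) give the two component parameters directly. The Python sketch below illustrates this under the assumption that the smaller root of the quadratic (12) is the one wanted, which follows from the convention that the first component has the larger mean. The function name `poisson_alpha_known` and the test values are hypothetical.

```python
import math


def poisson_alpha_known(xbar, nu2, alpha):
    """Estimates (mu*, lambda*) from equations (13) and (14) when alpha is known."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    spread = nu2 - xbar ** 2
    if spread < 0.0:
        raise ValueError("nu2 < xbar**2: quadratic (12) has no real root")
    lam = xbar - math.sqrt(alpha * spread / (1.0 - alpha))   # equation (13)
    mu = (xbar - lam * (1.0 - alpha)) / alpha                # equation (14)
    return mu, lam


# Hypothetical exact moments of a mixture with alpha = 0.5, mu = 1.5, lambda = 0.5:
# xbar = 1.0 and nu_[2] = 0.5 * 1.5**2 + 0.5 * 0.5**2 = 1.25.
mu_a, lam_a = poisson_alpha_known(1.0, 1.25, 0.5)
print(mu_a, lam_a)
```

Feeding in exact population moments, as here, should return the generating parameters, which makes a convenient sanity check before applying the formulas to sample data.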
b. $\alpha$ and $\mu$ Known


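In this case, $\lambda$ may be estimated from the first equation of (11) as

$$ \lambda^{*} = \frac{\bar{x} - \alpha\mu}{1 - \alpha}. \tag{15} $$


c. $\alpha$ and $\lambda$ Known

In this case, it follows from equation (11) that

$$ \mu^{*} = \frac{\bar{x} - (1 - \alpha)\lambda}{\alpha}. \tag{16} $$


d. $\mu$ Known

In this case, we may employ equations (11) to estimate $\alpha$ and $\lambda$.
Accordingly, from the second equation of (11)

$$ \lambda^{*} = \frac{\nu_{[2]} - \bar{x}\mu}{\bar{x} - \mu}, \tag{17} $$

and from the first equation of (11)

$$ \alpha^{*} = \frac{\bar{x} - \lambda^{*}}{\mu - \lambda^{*}}. \tag{18} $$


e. $\lambda$ Known

In this case, the second equation of (11) gives

$$ \mu^{*} = \frac{\nu_{[2]} - \bar{x}\lambda}{\bar{x} - \lambda}, \tag{19} $$

and from the first equation of (11)

$$ \alpha^{*} = \frac{\bar{x} - \lambda}{\mu^{*} - \lambda}. \tag{20} $$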
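
Cases d and e, in which one component parameter is known, are mirror images of each other. The Python sketch below assumes that equations (17) through (20) are obtained by solving the two equations of (11) for the unknown pair; the function names and the test values are hypothetical.

```python
def poisson_mu_known(xbar, nu2, mu):
    """Case d: estimates (lambda*, alpha*), read here as equations (17) and (18)."""
    lam = (nu2 - xbar * mu) / (xbar - mu)      # second equation of (11) solved for lambda
    alpha = (xbar - lam) / (mu - lam)          # first equation of (11)
    return lam, alpha


def poisson_lambda_known(xbar, nu2, lam):
    """Case e: estimates (mu*, alpha*), read here as equations (19) and (20)."""
    mu = (nu2 - xbar * lam) / (xbar - lam)     # second equation of (11) solved for mu
    alpha = (xbar - lam) / (mu - lam)          # first equation of (11)
    return mu, alpha


# Hypothetical exact moments for alpha = 0.5, mu = 1.5, lambda = 0.5.
lam_d, alpha_d = poisson_mu_known(1.0, 1.25, 1.5)
mu_e, alpha_e = poisson_lambda_known(1.0, 1.25, 0.5)
print(lam_d, alpha_d, mu_e, alpha_e)
```

Both functions divide by the difference between the sample mean and the known parameter, so they are unreliable when the sample mean lies close to that parameter.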


III. MIXTURES OF TWO EXPONENTIAL DISTRIBUTIONS 


1. The Probability Density Function

In many respects the exponential distribution may be thought of as
a continuous analog to the discrete Poisson distribution. In any event,
estimating equations in mixtures of two exponential distributions quite
closely parallel the estimating equations considered in Section II for
mixtures of two Poisson distributions. Consider a compound exponential
distribution with probability density function

$$ f(x) = \alpha(1/\mu)e^{-x/\mu} + (1 - \alpha)(1/\lambda)e^{-x/\lambda}, \qquad x \ge 0, \quad \mu > \lambda > 0, \quad 0 \le \alpha \le 1. \tag{21} $$

The nonessential restriction that $\mu > \lambda$ is imposed as a matter of
convenience and without any loss of generality.

The $k$th noncentral moment of $x$ is

$$ m'_k = \int_0^\infty x^{k} f(x)\,dx = k!\,[\alpha\mu^{k} + (1 - \alpha)\lambda^{k}]. \tag{22} $$

Accordingly, the first three noncentral moments are

$$ \begin{aligned} m'_1 &= \alpha\mu + (1 - \alpha)\lambda, \\ m'_2 &= 2[\alpha\mu^{2} + (1 - \alpha)\lambda^{2}], \\ m'_3 &= 6[\alpha\mu^{3} + (1 - \alpha)\lambda^{3}]. \end{aligned} \tag{23} $$
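
The step behind equation (22) is not written out in the text. A short worked derivation, offered as a sketch and using only the substitution $t = x/\mu$ together with the standard integral $\int_0^\infty t^k e^{-t}\,dt = k!$, is:

```latex
% Single exponential component with mean \mu; substitute t = x/\mu.
\int_0^\infty x^{k}\,\frac{1}{\mu}\,e^{-x/\mu}\,dx
  = \mu^{k}\int_0^\infty t^{k} e^{-t}\,dt
  = k!\,\mu^{k},
% and, by linearity of the integral over the two mixture components,
m'_k = \alpha\,k!\,\mu^{k} + (1-\alpha)\,k!\,\lambda^{k}
     = k!\,[\alpha\mu^{k} + (1-\alpha)\lambda^{k}].
```

Dividing the $k$th noncentral moment by $k!$ therefore gives the same form as the $k$th factorial moment of the mixed Poisson distribution, which is why the two sets of estimating equations run in parallel.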
2. Three-Moment Estimators


When the first three noncentral sample moments, designated $\nu'_1$, $\nu'_2$,
and $\nu'_3$, respectively, with $\nu'_1 = \bar{x}$, are equated to the theoretical
moments of (23), we obtain the estimating equations

$$ \begin{aligned} \bar{x} - \lambda &= \alpha(\mu - \lambda), \\ \nu'_2/2 - \lambda^{2} &= \alpha(\mu^{2} - \lambda^{2}), \\ \nu'_3/6 - \lambda^{3} &= \alpha(\mu^{3} - \lambda^{3}). \end{aligned} \tag{24} $$

These equations differ from the corresponding equations for mixed Poisson
distributions only in that $\nu'_2/2$ and $\nu'_3/6$ have replaced the factorial
moments $\nu_{[2]}$ and $\nu_{[3]}$ of the mixed Poisson distribution.

On eliminating $\alpha$ between the first and second and between the first
and third equations of (24), we simplify to obtain

$$ \bar{x}\theta - \Gamma = \nu'_2/2, \qquad \bar{x}(\theta^{2} - \Gamma) - \Gamma\theta = \nu'_3/6, \tag{25} $$

which are completely analogous to the last two equations of (2) in the
case of mixed Poisson distributions. Here, as in the Poisson case, $\theta$
and $\Gamma$ are defined by equation (3). Accordingly, on solving the two
equations of (25) simultaneously, we have as estimators of $\theta$ and $\Gamma$

$$ \theta^{*} = \frac{\nu'_3/6 - \bar{x}\,\nu'_2/2}{\nu'_2/2 - \bar{x}^{2}}, \qquad \Gamma^{*} = \bar{x}\theta^{*} - \nu'_2/2, \tag{26} $$

which are analogous to equation (5) for the mixed Poisson distribution.

Finally, with $\theta^{*}$ and $\Gamma^{*}$ determined from (26), $\mu^{*}$ and $\lambda^{*}$ follow
from equation (6) as in the Poisson case, and $\alpha^{*}$ follows from the first
equation of (24) as

$$ \alpha^{*} = \frac{\bar{x} - \lambda^{*}}{\mu^{*} - \lambda^{*}}. \tag{27} $$
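
The exponential case can be sketched in the same way as the Poisson case, with the halved second moment and the third moment divided by six taking the place of the factorial moments. The Python code below is an illustrative transcription of equations (26), (6), and (27); the function name `three_moment_exponential` is an assumption, and the inputs are exact population moments for hypothetical parameters rather than sample data.

```python
import math


def three_moment_exponential(xbar, m2, m3):
    """Three-moment estimates (mu*, lambda*, alpha*) for a two-component
    exponential mixture; m2 and m3 are the noncentral sample moments."""
    h2, h3 = m2 / 2.0, m3 / 6.0                       # nu'_2/2 and nu'_3/6
    theta = (h3 - xbar * h2) / (h2 - xbar ** 2)       # equation (26)
    gamma = xbar * theta - h2
    disc = theta ** 2 - 4.0 * gamma
    if disc <= 0.0:
        raise ValueError("quadratic (7) has no two distinct real roots")
    root = math.sqrt(disc)
    mu = 0.5 * (theta + root)                          # equation (6)
    lam = 0.5 * (theta - root)
    alpha = (xbar - lam) / (mu - lam)                  # equation (27)
    return mu, lam, alpha


# Exact moments from equation (23) for mu = 2, lambda = 1, alpha = 0.4:
# m'_1 = 1.4, m'_2 = 2 * 2.2 = 4.4, m'_3 = 6 * 3.8 = 22.8.
mu_x, lam_x, alpha_x = three_moment_exponential(1.4, 4.4, 22.8)
print(mu_x, lam_x, alpha_x)
```

With exact moments the generating parameters are recovered up to rounding. With sample moments the third moment in particular carries appreciable sampling error, so the estimates can move noticeably when the inputs are rounded.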


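3. Estimation With Some Parameters Specified

a. $\alpha$ Known

We need only replace $\nu_{[2]}$ with $\nu'_2/2$ and the quadratic equation
of (12) becomes, for the present case,

$$ \lambda^{2} - 2\bar{x}\lambda + \frac{\bar{x}^{2} - \alpha\nu'_2/2}{1 - \alpha} = 0. \tag{28} $$

Accordingly,

$$ \lambda^{*} = \bar{x} - \sqrt{\frac{\alpha(\nu'_2/2 - \bar{x}^{2})}{1 - \alpha}}, \qquad \mu^{*} = \frac{\bar{x} - \lambda^{*}(1 - \alpha)}{\alpha}. \tag{29} $$


b. $\alpha$ and $\mu$ Known

In this case, the estimator for $\lambda$ follows from the first
equation of (24) as

$$ \lambda^{*} = \frac{\bar{x} - \alpha\mu}{1 - \alpha}, \tag{30} $$

which is identical with the corresponding estimator, equation (15), in
the Poisson case.


c. $\alpha$ and $\lambda$ Known

In this case, it follows from the first equation of (24) that

$$ \mu^{*} = \frac{\bar{x} - (1 - \alpha)\lambda}{\alpha}. \tag{31} $$


d. $\lambda$ Known

In this case, we need only replace $\nu_{[2]}$ in equation (19) with
$\nu'_2/2$ and, accordingly,

$$ \mu^{*} = \frac{\nu'_2/2 - \bar{x}\lambda}{\bar{x} - \lambda}, \tag{32} $$

$$ \alpha^{*} = \frac{\bar{x} - \lambda}{\mu^{*} - \lambda}. \tag{33} $$


IV. COMPUTATIONAL PROCEDURES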


The solution of the transcendental estimating equation (8) from
Section II provides an interesting illustration of iterative numerical
computational techniques described by Whittaker and Robinson (loc. cit.).
To facilitate solution of equation (8), the denominator of the left side
is interchanged with the numerator of the right side, and the resulting
equation becomes

$$ \frac{\bar{x} - \lambda}{n_0/n - e^{-\lambda}} = \frac{G(\lambda) - \lambda}{e^{-G(\lambda)} - e^{-\lambda}}, \tag{34} $$

where $G(\lambda)$ remains as given by equation (9). Denoting the left and
right sides of equation (34) by $L(\lambda)$ and $R(\lambda)$, respectively, we have

$$ L(\lambda) = \frac{\bar{x} - \lambda}{n_0/n - e^{-\lambda}}, \qquad R(\lambda) = \frac{G(\lambda) - \lambda}{e^{-G(\lambda)} - e^{-\lambda}}, \tag{35} $$

and the required estimate $\lambda^{**}$ is the value of $\lambda$ for which
$L(\lambda) = R(\lambda)$.

We begin with an initial approximation $\lambda_0$ and iterate toward the
value $\lambda^{**}$ as described by Whittaker and Robinson [4, pp. 81-83]. The
three-moment estimate of $\lambda$ given by equation (6) of Section II provides
a satisfactory value for $\lambda_0$. This initial approximation is substituted
into the second equation of (35) to obtain $R_0$, which is merely an
abbreviated notation for $R(\lambda_0)$. We then solve the equation

$$ L(\lambda_1) = R_0 \tag{36} $$

to obtain $\lambda_1$, the next approximation. This cycle is repeated as many
times as necessary to attain the desired degree of accuracy. Equation
(36) is itself a transcendental equation, though somewhat simpler in
form than the original equation (34). It is amenable to solution by
the Newton-Raphson method [4, pp. 84-86]. For the $i$th cycle of
iteration, the equation corresponding to (36) becomes

$$ L(\lambda_i) = \frac{\bar{x} - \lambda_i}{n_0/n - e^{-\lambda_i}} = R_{i-1}, \tag{37} $$

which may be written as

$$ f(\lambda_i) = 0, $$

where

$$ f(\lambda_i) = \lambda_i - R_{i-1}e^{-\lambda_i} - C_{i-1} \tag{38} $$

and

$$ C_{i-1} = \bar{x} - R_{i-1}\,n_0/n. $$

Equation (37) may be readily solved using the Newton-Raphson method,
where $\lambda_{i:r+1}$, the $(r + 1)$st iterant to $\lambda_i$, is given by

$$ \lambda_{i:r+1} = \lambda_{i:r} - \frac{f(\lambda_{i:r})}{f'(\lambda_{i:r})}. $$

The first derivative of $f(\lambda_i)$ follows from equation (38) as

$$ f'(\lambda_i) = 1 + R_{i-1}e^{-\lambda_i}. $$

Accordingly,

$$ \lambda_{i:r+1} = \lambda_{i:r} - \frac{\lambda_{i:r} - R_{i-1}e^{-\lambda_{i:r}} - C_{i-1}}{1 + R_{i-1}e^{-\lambda_{i:r}}}. \tag{39} $$

As an initial approximation $\lambda_{i:0}$ to $\lambda_i$, it will usually be
satisfactory to let $\lambda_{i:0} = \lambda_{i-1}$. The Newton-Raphson iterative technique is
continued through as many cycles as necessary to attain the desired
accuracy in $\lambda_i$. More specifically, this procedure is terminated at the
end of the $r$th cycle, the first cycle for which

$$ |L(\lambda_{i:r}) - R_{i-1}| < \delta_1, $$

where $\delta_1$ specifies the maximum permissible absolute value deviation.

With $\lambda_i$ thus determined, we calculate $R_i$, set up the new equation

$$ L(\lambda_{i+1}) = R_i, $$

and continue the primary routine through $k$ cycles. The $k$th cycle is the
first for which

$$ |L_k - R_k| < \delta_2, \tag{40} $$

where $\delta_2$ specifies the maximum allowable absolute value deviation. The
required estimate of $\lambda$ is then

$$ \lambda^{**} = \lambda_k. $$
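
The two-level iteration described above can be sketched compactly. The Python code below is a minimal illustration, not a transcription of the FIND program in the Appendix. It uses a single tolerance for both stopping rules and caps both loops at 500 cycles, and the function name `zero_frequency_estimates` is an assumption. The second factorial moment 1.248 and the starting value 0.47765894 are recomputed from the Section V frequency table.

```python
import math


def zero_frequency_estimates(xbar, nu2, p0, lam0, tol=1e-5, max_cycles=500):
    """Solve L(lam) = R(lam), equation (34), by the nested iteration of Section IV.

    xbar, nu2: first two sample factorial moments; p0: zero-frequency n_0/n;
    lam0: starting value, e.g. the three-moment estimate of lambda.
    Returns (mu, lam, alpha) as in equation (10).
    """
    def G(lam):                                    # equation (9)
        return (nu2 - xbar * lam) / (xbar - lam)

    def L(lam):                                    # left side of (34)
        return (xbar - lam) / (p0 - math.exp(-lam))

    def R(lam):                                    # right side of (34)
        g = G(lam)
        return (g - lam) / (math.exp(-g) - math.exp(-lam))

    lam = lam0
    for _ in range(max_cycles):                    # primary cycles
        r_prev = R(lam)
        c_prev = xbar - r_prev * p0
        for _ in range(max_cycles):                # Newton-Raphson on f, eq. (38)
            f = lam - r_prev * math.exp(-lam) - c_prev
            f_prime = 1.0 + r_prev * math.exp(-lam)
            lam -= f / f_prime                     # equation (39)
            if abs(L(lam) - r_prev) < tol:
                break
        if abs(L(lam) - R(lam)) < tol:             # stopping rule (40)
            mu = G(lam)
            return mu, lam, (xbar - lam) / (mu - lam)
    raise RuntimeError("no convergence within max_cycles primary cycles")


# Section V mixed Poisson example: xbar, nu_[2] (recomputed), n_0/n, and the
# three-moment estimate of lambda as the starting value.
mu2, lam2, alpha2 = zero_frequency_estimates(0.9995, 1.248, 830 / 2000, 0.47765894)
print(mu2, lam2, alpha2)
```

The primary cycle converges only linearly, so several dozen cycles may be needed even from a good starting value. The result is also sensitive to the zero-frequency, because the two sides of equation (34) have nearly equal slopes near the root.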



V. ILLUSTRATIVE EXAMPLES 


1. Mixed Poisson Distribution 


To illustrate the application of his three-moment estimators,
Rider [3] chose an example constructed by mixing equal proportions of
two Poisson distributions with $\mu = 1.5$ and $\lambda = 0.5$, respectively. These
data are as follows:

    x       0     1     2     3     4     5     6     7
    n_x   830   638   327   137    49    15     3     1

In summary, $n = 2{,}000$, $n_0 = 830$, $\bar{x} = 0.9995$, $\nu_{[2]} = 1.248$, and
$\nu_{[3]} = 1.734$. Direct substitution of these values into equations (5)
and (6) yields the three-moment estimates

$\mu^{*} = 1.4766563$,
$\lambda^{*} = 0.47765894$,
$\alpha^{*} = 0.52236479$.

The above results differ slightly from those given by Rider due,
apparently, to small round-off errors in his calculations.

Estimates based on the first two sample moments and the sample
zero-frequency, calculated by a computer program of the routine
described in Section IV, are

$\mu^{**} = 1.4936$,
$\lambda^{**} = 0.4956$,
$\alpha^{**} = 0.5049$.

These estimates are in much closer agreement with the actual population
parameters $\mu = 1.5$, $\lambda = 0.5$, and $\alpha = 0.5$ than the three-moment estimates.
Investigations are continuing with regard to the relative efficiency of
the three-moment and the two-moment plus zero-frequency estimates; but
at least in the present instance, where a large proportion of the
population is in the zero class, the two-moment plus zero-frequency
estimates seem to be preferred.


2. Mixed Exponential Distribution

To illustrate the application of estimators derived in this case, a
sample of 2000 observations was selected from a mixed population constructed
by combining two exponential distributions with $\mu = 2$, $\lambda = 1$, and $\alpha = 0.4$.
Data for the sample selected are summarized as follows: $n = 2{,}000$,
$\bar{x} = 1.42$, $\nu'_2 = 4.38$, and $\nu'_3 = 21.6$.

Direct substitution of these data into equations (26), (6), and
(27) yields as three-moment estimates:

$\mu^{*} = 1.85$,
$\lambda^{*} = 1.02$,
$\alpha^{*} = 0.48$.
APPENDIX


FIND - A Computer Program 

By 

Frank C. Clark 


FIND is a Fortran IV computer program which calculates estimates for
the parameters $\alpha$, $\mu$, and $\lambda$ of a compound (mixed) Poisson distribution.
These estimates are calculated from (1) the first three sample moments
and (2) the first two sample moments and the sample zero-frequency.

In finding $\lambda$ for the second case, the following equation is solved:

$$ \frac{\bar{x} - \lambda}{G(\lambda) - \lambda} = \frac{n_0/n - e^{-\lambda}}{e^{-G(\lambda)} - e^{-\lambda}}, $$

where

$$ G(\lambda) = \frac{\nu_{[2]} - \bar{x}\lambda}{\bar{x} - \lambda}, $$

and $\nu_{[2]}$, $\bar{x}$, and $n_0/n$ are known constants. FIND makes use of the
Newton-Raphson and geometrical iteration methods [4] in solving the equation.

FIND requires, for each data sample, input values for $\bar{x}$, $n_0/n$, $\nu_{[2]}$,
and $\nu_{[3]}$, punched on a single card. Iteration continues through
$k < 500$ cycles until the absolute error of equation (40) is less than
0.00001, i.e., until

$$ |L_k - R_k| < 0.00001. $$

If this accuracy has not been attained when $k = 500$, the message
"Completed 500 iterations with no success" is given and the program
stops. Should greater accuracy be required in the estimate of $\lambda$,
appropriate change should be made in the source program card
"TOL = .00 ... ."

FIND prints out the following:

1. Values of the index, $i$.

2. Values of $\lambda$ in the Newton-Raphson iteration.

3. Values of ERROR = TEST1 - TOL, where TEST1 = $|L_{i:r} - R_{i-1}|$.

4. $\alpha$, $\mu$, and $\lambda$ based on the first three sample moments.
   (This value of $\lambda$ is used as the first approximation in
   the Newton-Raphson process.)

5. $\alpha$, $\mu$, and $\lambda$ based on the first two sample moments and the
   sample zero-frequency.

FIND (FORTRAN IV) 


C     ESTIMATION IN MIXTURES OF TWO POISSON DISTRIBUTIONS
C     INPUT CARD - XBAR, N0/N, NU(2), NU(3)
      DIMENSION LAM(4000)
      REAL MU,NUE2,NUE3,NON,LAM,L,LAMBDA,MU1
    1 READ (5,2) XBAR,NON,NUE2,NUE3
    2 FORMAT(4F10.5)
C     THREE-MOMENT ESTIMATES, EQUATIONS (5) AND (6)
      THET = (NUE3-XBAR*NUE2)/(NUE2-(XBAR**2))
      CLAM = XBAR*THET-NUE2
      MU = (THET+SQRT(THET**2-4.0*CLAM))/2.0
      I=1
      LAM(I)=(THET-SQRT(THET**2-4.0*CLAM))/2.0
      N = 0
      ALPHA1 = (XBAR-LAM(I))/(MU-LAM(I))
      K = 0
C     G FROM EQUATION (9), R FROM EQUATION (35)
      G = (NUE2-XBAR*LAM(I))/(XBAR-LAM(I))
      N = N + 1
      R = (G-LAM(I))/(EXP(-G)-EXP(-LAM(I)))
    9 C = (XBAR-NON*R)
C     NEWTON-RAPHSON STEP, EQUATION (39)
   10 K=K+1
      LAM(I+1)=LAM(I)-((LAM(I)-R*EXP(-LAM(I))-C)/(1.0+R*EXP(-LAM(I))))
      L = (XBAR-LAM(I+1))/(NON-EXP(-LAM(I+1)))
      TOL = .00001
      TEST1 = ABS(L-R)
   60 FORMAT(1H ,I5,5X,E15.8,E15.8)
      ERROR = TEST1 - TOL
      WRITE (6,60) I,LAM(I),ERROR
      IF (TEST1-TOL)20,15,15
   23 I=I+1
      GO TO 10
C     INNER LOOP CONVERGED - UPDATE G AND R, THEN TEST EQUATION (40)
   20 G = (NUE2-XBAR*LAM(I+1))/(XBAR-LAM(I+1))
      R = (G-LAM(I+1))/(EXP(-G)-EXP(-LAM(I+1)))
      TEST2 = ABS(L-R)
      IF (TEST2-TOL)30,25,25
   24 I=I+1
C     COUNT THE PRIMARY CYCLE SO THAT THE LIMIT TESTED AT 25 APPLIES
      N = N + 1
      K = 0
      GO TO 9
   15 IF(500-K)22,22,23
   25 IF(500-N)22,22,24
   22 WRITE(6,28)
   28 FORMAT(41H1COMPLETED 500 ITERATIONS WITH NO SUCCESS)
      GO TO 100
C     ESTIMATES FROM EQUATION (10)
   30 MU1 = (NUE2-XBAR*LAM(I+1))/(XBAR-LAM(I+1))
      LAMBDA = LAM(I+1)
      ALPHA2 = (XBAR-LAM(I+1))/(MU1-LAM(I+1))
      WRITE(6,50)
   50 FORMAT(39H1ESTIMATES BASED ON FIRST THREE MOMENTS)
      WRITE(6,51)MU,LAM(1),ALPHA1
   51 FORMAT(10H0    MU = E15.8,10H LAMBDA = E15.8,9H ALPHA = E15.8)
      WRITE(6,52)
   52 FORMAT(74H0ESTIMATES BASED ON FIRST TWO SAMPLE MOMENTS AND THE ZER
     1O SAMPLE FREQUENCY)
      WRITE(6,53) MU1,LAMBDA,ALPHA2
   53 FORMAT(11H0     MU = E15.8,10H LAMBDA = E15.8,10H  ALPHA = E15.8)
      GO TO 1
  100 STOP
      END